My final project examines the spread of Covid-19 in Latin America. The internet has been overrun with misinformation about the pandemic, so effective and accurate communication about the virus is critically important.
To compare the growth in cases between countries, the growth timelines need to be standardized so that the effectiveness of each country’s response can be judged fairly against the others. To do this, I created a new variable for each country that counts the number of days since the country reached 100 cases. This starts every country’s curve at a common origin (day 0, at roughly 100 cases), which makes the trajectories much easier to compare.
In this view, it’s clear that Brazil’s cases have consistently increased faster than cases in other Latin American countries. Other countries have also struggled to slow the spread of the virus, such as Ecuador and, increasingly, Peru. As more countries move further from their first 100 reported cases, it will be interesting to monitor their growth rates to see whether they show the exponential growth seen in many countries around the world.
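The alignment step described above (re-indexing each country by days since it reached 100 cases) can be sketched roughly as follows. This is a minimal Python illustration, not the code used for the figure: the function name `days_since_threshold`, the input layout, and the case counts are all assumptions made up for the example.

```python
from datetime import date


def days_since_threshold(series, threshold=100):
    """Re-index a cumulative case series so that day 0 is the first date
    on which the count reached `threshold`.

    `series` is a list of (date, cumulative_cases) pairs sorted by date.
    Returns (days_since_threshold, cumulative_cases) pairs; dates before
    the threshold was reached are dropped.
    """
    start = next((d for d, cases in series if cases >= threshold), None)
    if start is None:
        return []  # this country has not reached the threshold yet
    return [((d - start).days, cases) for d, cases in series if d >= start]


# Made-up numbers for illustration only, not real case counts.
country_a = [
    (date(2020, 3, 10), 40),
    (date(2020, 3, 11), 95),
    (date(2020, 3, 12), 130),
    (date(2020, 3, 13), 210),
]
country_b = [
    (date(2020, 3, 20), 80),
    (date(2020, 3, 21), 105),
    (date(2020, 3, 22), 160),
]

aligned_a = days_since_threshold(country_a)
aligned_b = days_since_threshold(country_b)
print(aligned_a)
print(aligned_b)
```

Both hypothetical countries now start at day 0 even though they crossed 100 cases more than a week apart, which is what lets their curves share one x-axis.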
The figure in the next tab shows the daily new cases in each South American country.
This figure shows the new cases reported each day after March 14th in South American countries. What’s especially striking is how sharply the daily counts swing from one day to the next in countries like Ecuador, Chile, Peru, and Brazil. Swings of this size undermine trust in the accuracy of the reported data and suggest that many cases are not being reported promptly to central authorities.
This highlights one of the most difficult aspects of tracking the spread of this virus: without widespread testing and reporting of cases, it is almost impossible to truly understand the scope of the pandemic in a country. Furthermore, many governments see it as beneficial to minimize the scope of the disease in their country, which pushes them to avoid attributing many deaths to Covid-19 or to exclude asymptomatic cases from the official total.
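For readers who want to reproduce the daily counts discussed above, they can be derived from a cumulative total by differencing consecutive days. The sketch below also includes a trailing average, which is not part of the original analysis; it is added here only as one common way to see a trend through day-to-day reporting swings. The function names and the numbers are assumptions for illustration.

```python
def daily_new_cases(cumulative):
    """Differences between consecutive cumulative totals."""
    return [today - yesterday
            for yesterday, today in zip(cumulative, cumulative[1:])]


def trailing_mean(values, window=7):
    """Mean of each value and up to `window - 1` values before it."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up cumulative totals for illustration only, not real data.
cumulative = [100, 150, 150, 320, 330, 500]

new_cases = daily_new_cases(cumulative)
smoothed = trailing_mean(new_cases, window=3)
print(new_cases)
print(smoothed)
```

A day with zero new cases followed by a large jump, as in this toy series, is the kind of pattern that delayed or batched reporting would produce.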
The next tab shows the initial linear regression results for several country-level factors that could affect the spread of Covid-19.
These linear regression results suggest that population size and several healthcare system factors are significantly associated with the total number of cases within a country. Because the model was fit on only 11 countries and leaves 4 residual degrees of freedom, these estimates should be treated as preliminary. I plan to add further factors that might influence the spread of the virus, such as the type of party leadership currently in power and whether social distancing or lockdown measures were put in place in a timely manner and enforced.
##
## Call:
## lm(formula = running_sum ~ gini_2017 + pop_millions + gdp_billions +
##     health_exp_gdp_2016 + out_pocket_exp_2016 + private_exp_pp_2016,
##     data = b)
##
## Residuals:
##       1       2       3       4       5       9      13      16      17      18
##  227.10  515.07  115.76 -640.46 -637.26  484.04 -470.59  843.48  -52.56  -75.53
##      20
## -309.04
##
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)         13105.221   4463.491   2.936 0.042554 *
## gini_2017            -353.883     83.151  -4.256 0.013096 *
## pop_millions          383.174     91.714   4.178 0.013942 *
## gdp_billions          -28.453      9.018  -3.155 0.034345 *
## health_exp_gdp_2016 -1787.972    309.652  -5.774 0.004467 **
## out_pocket_exp_2016   318.660     36.586   8.710 0.000957 ***
## private_exp_pp_2016    32.016      4.103   7.804 0.001455 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 777.5 on 4 degrees of freedom
##   (10 observations deleted due to missingness)
## Multiple R-squared:  0.9947, Adjusted R-squared:  0.9868
## F-statistic:   126 on 6 and 4 DF,  p-value: 0.0001651
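As a reading aid for the output above, the short Python sketch below recomputes three of its quantities from the printed numbers: each t value (estimate divided by standard error), the residual degrees of freedom (observations used minus coefficients estimated), and the adjusted R-squared. All inputs are transcribed from the output; the variable names in the code are my own, and the results should match the table only up to rounding of the printed values.

```python
# (estimate, standard error) pairs transcribed from the Coefficients table.
coefficients = {
    "(Intercept)": (13105.221, 4463.491),
    "gini_2017": (-353.883, 83.151),
    "pop_millions": (383.174, 91.714),
    "gdp_billions": (-28.453, 9.018),
    "health_exp_gdp_2016": (-1787.972, 309.652),
    "out_pocket_exp_2016": (318.660, 36.586),
    "private_exp_pp_2016": (32.016, 4.103),
}

# Each t value is the estimate divided by its standard error.
t_values = {name: est / se for name, (est, se) in coefficients.items()}

# Residuals are listed for 11 observations (10 more were dropped as missing).
n_used = 11
n_params = len(coefficients)      # six predictors plus the intercept
residual_df = n_used - n_params   # matches "4 degrees of freedom"

# Adjusted R-squared penalizes R-squared for the number of parameters.
r_squared = 0.9947
adjusted_r_squared = 1 - (1 - r_squared) * (n_used - 1) / residual_df

for name, t in t_values.items():
    print(f"{name:<20} t = {t:7.3f}")
print("residual df:", residual_df)
print("adjusted R-squared:", round(adjusted_r_squared, 4))
```

With seven coefficients estimated from eleven countries, only four degrees of freedom remain, which is why a very high R-squared here says little about how well the model would generalize.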